Exploiting multiple heterogeneous data sets for improving geotagging quality

Authors

  • Laura Di Rocco
  • Roberto Marzocchi
  • Tiziano Cosso
  • Barbara Catania
  • Giovanna Guerrini
Abstract

Geotagging is the process of associating textual data items with the geographic position they denote, usually expressed as geographical coordinates (latitude and longitude). Automatic geotagging is often straightforward when it relies on one of the many available gazetteers, such as OpenStreetMap (OSM). However, such knowledge bases are not free of errors and, while this simple matching works for popular locations, automatic annotation of less prominent venues and events may be significantly inaccurate. The goal of this work is to increase geotagging quality, in terms of completeness and accuracy, by identifying diverse data sources and jointly exploiting them as gazetteers. This will also allow us to cope with ambiguity by additionally performing semantic queries on various open knowledge bases.
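
The idea of consulting several gazetteers together, rather than a single one, can be illustrated with a minimal sketch. This is not the authors' method: the function names (normalize, geotag), the in-memory dictionaries standing in for gazetteers, and the approximate coordinates are all assumptions made purely for illustration. The sketch looks a place name up in each source in priority order and returns the first match, so a venue missing from the first source can still be resolved by a later one.

```python
from typing import Dict, List, Optional, Tuple

Coordinates = Tuple[float, float]  # (latitude, longitude)
Gazetteer = Dict[str, Coordinates]  # normalized place name -> coordinates


def normalize(name: str) -> str:
    """Lowercase the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def geotag(name: str, gazetteers: List[Gazetteer]) -> Optional[Coordinates]:
    """Return the coordinates from the first gazetteer that knows the name.

    Gazetteers are tried in list order; None means no source had a match.
    """
    key = normalize(name)
    for gazetteer in gazetteers:
        if key in gazetteer:
            return gazetteer[key]
    return None


# Hypothetical toy gazetteers with approximate, illustrative coordinates.
primary: Gazetteer = {"genoa": (44.4056, 8.9463)}
secondary: Gazetteer = {
    "genoa": (44.41, 8.93),
    "boccadasse": (44.3906, 8.9736),
}

sources = [primary, secondary]
popular = geotag("  Genoa ", sources)      # found in the primary source
less_known = geotag("Boccadasse", sources)  # only the secondary source has it
unknown = geotag("Atlantis", sources)       # no source has it
```

A first-match policy is the simplest possible way to combine sources. Resolving disagreements between sources, or disambiguating names that denote several places, would need additional logic that this sketch does not attempt.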

Publication date: 2017